Method of optimizing search system

ABSTRACT

The present disclosure provides a method of optimizing a search system, which relates to a field of data processing, and in particular to a field of data search. The method is implemented to include: determining a first hit rate of a cache unit of the search system for a plurality of user queries; for each element in a first set of elements, determining at least one key element by: generating a plurality of first queries that correspond to the plurality of user queries; determining a second hit rate of the offline cache unit for the plurality of first queries; and determining the element as one of at least one key element in response to determining that a difference between the second hit rate and the first hit rate is less than a difference threshold; and optimizing the search system based on the at least one key element.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to Chinese Application No. 202011530562.9 filed on Dec. 22, 2020, which is incorporated herein by reference in its entirety.

TECHNICAL FIELD

The present disclosure relates to a field of data processing technology, in particular to a field of data search technology, and more specifically to a method of optimizing a search system, an electronic device, and a computer-readable storage medium.

BACKGROUND

In the era of network information, a search system needs to deal with massive user queries at all times. In order to improve a response speed and reduce a computational overhead, a cache (sometimes referred to as a cache unit herein) in the search system is very important to an overall resource consumption of the search system. When the search system processes the same query again, it may directly find corresponding query results from the cache in response to the user query, so as to reduce a processing pressure of the search system and improve the response speed. Accordingly, a change in a cache hit rate may lead to a change in the overall resource consumption of the search system. For example, when the cache hit rate decreases, the computational overhead of the search system may increase. The cache hit rate may be affected by various factors. Therefore, there is a need to efficiently determine a factor that affects the cache of the search system, and then optimize the search system.

SUMMARY

Embodiments of the present disclosure provide a method of optimizing a search system, an electronic device, and a computer-readable storage medium.

According to an aspect, there is provided a method of optimizing a search system, including: determining a first hit rate of a cache unit of the search system for a plurality of user queries, wherein each user query is associated with a plurality of elements; for each element in a first set of elements of the plurality of elements, determining at least one key element by: generating a plurality of first queries corresponding to the plurality of user queries, wherein the plurality of first queries are associated with at least the element; determining a second hit rate of the cache unit for the plurality of first queries; and determining the element as one of at least one key element, in response to determining that a difference between the second hit rate and the first hit rate is less than a difference threshold; and optimizing the search system based on the at least one key element.

According to another aspect, there is provided an electronic device, including: at least one processor; and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, cause the at least one processor to implement the method of optimizing a search system according to embodiments of the present disclosure.

According to another aspect, there is provided a non-transitory computer-readable storage medium having computer instructions stored thereon, wherein the computer instructions allow a computer to implement the method of optimizing a search system according to embodiments of the present disclosure.

It should be understood that content described in this section is not intended to identify key or important features in the embodiments of the present disclosure, nor is it intended to limit the scope of the present disclosure. Other features of the present disclosure will be easily understood through the following description.

BRIEF DESCRIPTION OF THE DRAWINGS

The above and other features, advantages and aspects of the embodiments of the present disclosure will become more apparent in combination with the drawings and with reference to the following detailed description. The drawings are used to better understand the solution and do not constitute a limitation to the present disclosure. In the drawings, same or similar reference numerals indicate same or similar elements.

FIG. 1 shows a schematic diagram of an exemplary environment in which various embodiments of the present disclosure may be implemented.

FIG. 2 shows a flowchart of a method of optimizing a search system according to some embodiments of the present disclosure.

FIG. 3 shows a flowchart of a method of determining at least one key element according to some embodiments of the present disclosure.

FIG. 4 shows a schematic diagram of hit rate curves according to some embodiments of the present disclosure.

FIG. 5 shows a flowchart of a method of determining a number of the user queries hit in a cache unit according to some embodiments of the present disclosure.

FIG. 6 shows a flowchart of a method of determining a number of the first user queries hit in the cache unit according to some embodiments of the present disclosure.

FIG. 7 shows a schematic diagram of hit rate curves according to some embodiments of the present disclosure.

FIG. 8 shows a block diagram of an apparatus of optimizing a search system according to some embodiments of the present disclosure.

FIG. 9 shows a block diagram of an electronic device for implementing the various embodiments of the present disclosure.

DETAILED DESCRIPTION OF EMBODIMENTS

The exemplary embodiments of the present disclosure are described below with reference to the drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and which should be considered as merely illustrative. Therefore, those of ordinary skill in the art should realize that various changes and modifications may be made to the embodiments described herein without departing from the scope and spirit of the present disclosure. In addition, for clarity and conciseness, descriptions of well-known functions and structures are omitted in the following description.

In the description of the embodiments of the present disclosure, the term “including” and similar terms should be understood as open-ended inclusion, that is, “including but not limited to”. The term “based on” should be understood as “at least partially based on.” The term “an embodiment,” “one embodiment” or “this embodiment” should be understood as “at least one embodiment.” The terms “first,” “second,” and the like may refer to different or the same objects. The following may also include other explicit and implicit definitions.

As discussed above, when a cache hit rate changes, especially when the cache hit rate decreases, it is desirable to determine a key element affecting the cache hit rate. In some schemes, the key element affecting the cache hit rate may be manually determined by observing a number of the user queries received by the search system, a network health status, a data retention time in the cache unit and other indicators, determining whether these indicators are consistent with a change trend of the cache hit rate, and combining the determination with experience. In other schemes, reasons affecting the cache hit rate may also be classified. For example, data exists in the cache but cannot be accessed due to network and other reasons; data are missing due to migration of a storage instance; data do not exist due to an external active deletion; an expiration period of data has passed or data do not exist. However, the schemes described above generally require a determination by an experienced person, so it usually takes a certain time to optimize the search system. Moreover, the schemes described above may not determine the key element at the level of the plurality of elements associated with a user query, so that a reason for the change in the cache hit rate may not be determined.

Embodiments of the present disclosure propose a technical solution of determining the key element affecting the cache unit and then optimizing the search system. In this solution, one or more key elements that have a great impact on the hit rate may be determined in a case that, for example, the cache hit rate significantly changes, by splitting a plurality of elements associated with the user query, determining a hit rate of the cache unit for a first query associated with a single element, and comparing the hit rate for the first query with a hit rate of the cache unit for a real and multi-element user query.

In this way, the change in the hit rate may be analyzed at the level of the elements forming the user query, so as to efficiently and accurately determine one or more key elements that have a great impact on the cache hit rate, and then optimize the search system based on the key elements determined. Accordingly, performance of the search system may be improved in a plurality of aspects such as the hit rate of the cache, the hit rate of the search system, the response time of the search system, and the computational overhead of the search system.

FIG. 1 shows a schematic diagram of an example environment 100 in which various embodiments of the present disclosure may be implemented.

The exemplary environment generally includes a search system 115 and a computing device 105. In the description of the embodiments of the present disclosure, the term “search system” refers to a system for returning a query result 108 in response to a user query 102. The search system 115 generally includes a cache unit 116 (sometimes referred to as a cache or a cache system). The search system 115 may further include an application 114 and a storage unit (not shown) for permanently storing data.

A plurality of historical user queries and corresponding historical query results may be stored in the cache unit 116, so that a real-time query from a user may be responded to quickly. These historical query results are corresponding query results retrieved by the search system 115 from the storage unit (not shown) for permanently storing data through a series of processes. In some embodiments, the series of processes may include, for example: receiving a (historical) user query request (via the application 114, for example); segmenting a text contained in the user query so as to determine at least one query term; calculating a weight of each query term; generating a query vector based on the weight and the at least one query term; transmitting an inverted list of the corresponding query term in the storage unit into a memory; determining a set of query results by an intersection of the inverted list of the corresponding query term; sorting the query results so as to determine query results with a high degree of association for an output (via the application 114, for example).

It may be understood that the above series of processes may take a lot of time and may consume a lot of computing resources. Therefore, the search system generally first checks in the cache unit whether the user query that has been processed by the application 114 (through a signature algorithm, for example) has a corresponding query result. A query text with high query frequency and the corresponding query results determined by the above series of processes are usually stored in the cache unit 116. When the same user or a different user conducts a user query containing the same query text again, the search system may directly use the corresponding query results in the cache unit 116 to respond to the user query. The above series of processes is executed only when no matching query result is found in the cache unit 116. In this way, a series of processes for a part of user queries, such as segmenting the text, calculating the weight and matching the keyword, may be omitted. Therefore, a search system with an appropriate cache (for example, with a high cache hit rate) may respond to the user query in time, so as to improve user experience and save a lot of computing resources. It may be understood that each user query may be associated with a time for conducting the user query.

In some embodiments, considering that the user query has a certain timeliness, the search system 115 may set a data expiration period for the user query cached in the cache unit 116 and the associated query results, for example, through a user behavior analysis. Data beyond the data expiration period may not be returned by the cache unit 116 to the application 114. In some embodiments, the data beyond the data expiration period may be cleaned or eliminated periodically so as to maintain a high utilization rate of a storage space and a high response speed of the cache unit.

In some embodiments, in the cache unit 116 (and/or an offline cache unit 126), the user query and the associated query result may be stored as a key-value pair. The user query may correspond to a key, and the query result may correspond to a value. In some embodiments, the key-value pair may be associated with a time for writing the key-value pair, so as to subsequently determine the data expiration period. In some other embodiments, the user query and the associated query result may be stored in association in other ways.

In some embodiments, if a key-value pair corresponding to the user query 102 exists in the cache unit 116, and a period between the time corresponding to the user query 102 and the write time associated with the key-value pair is less than the data expiration period, then the cache unit 116 may return the value in the key-value pair to the application 114. The value may contain, for example, the query result 108 for the user query, which may include but is not limited to a file, a picture, a web link, and so on. If the key-value pair corresponding to the user query 102 does not exist in the cache unit 116, or the period between the time corresponding to the user query 102 and the write time associated with the key-value pair is greater than the data expiration period, the cache unit 116 may not return the corresponding query result to the application 114. In other words, a null query result may be returned. In this case, the search system may further perform the above series of processes to determine the query result in the storage unit, and update the cache unit 116 (for example, store or insert the key-value pair into the cache unit 116) with the key-value pair of the user query and the query result at a predefined time. The predefined time may be, for example, real-time, every predetermined period, or at a time when few user queries are conducted (for example, early in the morning).
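The lookup rule described above may be illustrated by the following minimal Python sketch. The names `CacheEntry` and `lookup` are hypothetical and do not appear in the disclosure. The sketch also assumes that a pair whose age exactly equals the data expiration period is treated as expired, a case the description leaves open.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CacheEntry:
    """Hypothetical key-value pair payload: the value plus its write time."""
    value: str         # query result, or a placeholder
    write_time: float  # time at which the key-value pair was written


def lookup(cache: Dict[str, CacheEntry], key: str,
           query_time: float, expiration_period: float) -> Optional[str]:
    """Return the cached value on a hit, or None (a null result) on a miss."""
    entry = cache.get(key)
    if entry is None:
        # No key-value pair corresponds to the user query.
        return None
    if query_time - entry.write_time >= expiration_period:
        # Assumption: an age equal to the expiration period counts as expired.
        return None
    return entry.value
```

On a `None` result, a caller would fall back to the full series of processes and later write the new key-value pair into the cache.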

The application 114 may be configured as an interface with the user, to receive the user query 102 and return the corresponding query result 108. In some embodiments, the application may generate keys for a plurality of elements associated with the user query 102. This may be realized, for example, by various signature algorithms. Examples of signature algorithms include but are not limited to MD5 algorithm, RSA algorithm, DSS algorithm and SHA algorithm.

In the description of the embodiments of the present disclosure, the term “element” refers to a query element used to match a corresponding result in the cache unit and/or the storage unit. Examples of the element include but are not limited to: a type of a terminal providing the user query, location information for the terminal providing the user query, a key phrase contained in the user query, a time that the user query is conducted, a filtering condition contained in the user query, a number of pages containing query results corresponding to the user query, a number of query result entries contained in each of the pages containing query results corresponding to the user query, traffic tag(s) associated with the user query, an indicator indicating whether the user query belongs to a stress testing, and so on. In some embodiments, each user query may be associated with the same number of elements. In some cases, some elements may not be contained in the user query. For example, in a case that the filtering condition is not set for the user query, the corresponding element may be set to have a value of 0 so that the user query may contain the same number of elements to generate keys with the same number of digits.
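As an illustration only, key generation over a fixed set of elements might be sketched as follows. The element names in `ELEMENT_ORDER`, the separator character, and the function `make_key` are assumptions made for this sketch. MD5 is used because it is one of the algorithms listed above, via Python's standard `hashlib` module.

```python
import hashlib
from typing import Dict, Sequence

# Hypothetical element names; the disclosure lists many more possibilities.
ELEMENT_ORDER = ("terminal_type", "location", "key_phrase", "filter")


def make_key(query: Dict[str, object],
             element_order: Sequence[str] = ELEMENT_ORDER) -> str:
    """Build a fixed-length key from the elements of one query.

    Elements absent from the query are set to 0, so that every query
    contributes the same number of elements to the key.
    """
    parts = [str(query.get(name, 0)) for name in element_order]
    # Assumption: a unit-separator character keeps element boundaries unambiguous.
    joined = "\x1f".join(parts)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()
```

Because missing elements default to 0, a query without a filtering condition and a query whose filtering condition is explicitly 0 map to the same 32-character key.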

In addition to returning the query result 108 to the user, the application 114 may further transmit information about whether the query result 108 is found in the cache unit 116, to the computing device 105. For example, for a given query, if the corresponding query result is found in the cache unit 116, it may be called a hit, and the application 114 may, for example, transmit a first signal (for example, 1) to the computing device 105. If the corresponding query result is not found in the cache unit 116, it may be called a miss, and the application 114 may, for example, transmit a second signal (for example, 0) to the computing device 105.

The computing device 105 may determine the hit rate of the cache unit (sometimes referred to herein as a cache hit rate or a first hit rate) by using the first signal and the second signal. In the description of the embodiments of the present disclosure, unless explicitly stated to the contrary, the term “hit rate” refers to the hit rate of an (for example, online or offline) “cache unit” for a plurality of queries (for example, the user query or a first query discussed below).

The computing device 105 may include a key element determination module 104, an optimization module 106, an (optional) simulation module 124, and an (optional) offline cache unit 126. The simulation module 124 corresponds to the application 114 in functions, and the offline cache unit 126 corresponds to (for example, is the same as or similar to) the cache unit 116 in the data stored therein. By using the offline cache unit and the simulation module, the key element may be determined without affecting the user experience.

Because each user query in the plurality of user queries (for example, x user queries including user query 1 to user query x, where x is a positive integer greater than 1) is associated with a plurality of query elements (for example, n elements including element 1 to element n, where n is a positive integer greater than 1), the computing device 105 may, for example, use the simulation module 124 to build x first queries for each element of the n query elements. Each of the x first queries is at least associated with the element. Then, the corresponding key (for example, key 1 to key x) is determined by a signature algorithm. For example, for the element 1, a first query 1 to a first query x associated with the element 1 may be generated, and the x first queries are all associated with the element 1. For the element 2, a first query 1 to a first query x associated with the element 2 may be generated, and the x first queries are all associated with the element 2 and optionally associated with the element 1. Similarly, for the element i (i is a positive integer greater than 1 and less than or equal to n), a first query 1 to a first query x associated with the element i may be generated, and the x first queries are all associated with the element i and optionally associated with the element 1 to the element i-1.
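One possible reading of this construction is sketched below. It assumes that a first query "associated with" certain elements keeps the values of those elements and sets every other element to the placeholder 0. That interpretation, the name `build_first_queries`, and the zero-based `index` parameter are assumptions of this sketch rather than details given in the disclosure.

```python
from typing import Dict, List, Sequence

Query = Dict[str, object]


def build_first_queries(user_queries: Sequence[Query],
                        element_order: Sequence[str],
                        index: int,
                        cumulative: bool = True) -> List[Query]:
    """Build one first query per user query for the element at `index`.

    cumulative=True keeps element 1 through element i (the optional variant);
    cumulative=False keeps only element i.
    """
    if cumulative:
        kept = set(element_order[: index + 1])
    else:
        kept = {element_order[index]}
    first_queries = []
    for query in user_queries:
        # Elements outside `kept` are blanked with the placeholder 0.
        first_queries.append(
            {name: (query.get(name, 0) if name in kept else 0)
             for name in element_order}
        )
    return first_queries
```

Each resulting first query can then be turned into a key by the same signature algorithm used for real user queries.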

The computing device 105 may determine whether the corresponding query results for the x first queries may be found in the offline cache unit 126 or not. For example, the computing device may transmit x keys (for example, key 1 to key x) corresponding to x first queries to the offline cache unit 126 via the simulation module 124, receive the results (for example, x values including value 1 to value x) returned by the offline cache unit, and determine the second hit rate accordingly. It may be understood that when there is no corresponding key in the offline cache unit 126, the returned value may be 0.

Similarly, for a given first query, if a corresponding query result is found in the offline cache unit 126, it may be called a hit (which may be identified by a first signal such as 1). If no corresponding query result is found in the offline cache unit 126, it may be called a miss (which may be identified by a second signal such as 0).

The key-value pair(s) in the offline cache unit 126 may be the same as that (those) in the cache unit 116 so as to simulate a real cache unit. In order to further reduce the storage space occupied by the offline cache unit without affecting the simulation of the real cache unit, the key(s) in the key-value pair(s) in the offline cache unit 126 may be the same as that (those) in the key-value pair(s) in the cache unit 116, while the value(s) in the key-value pair(s) in the offline cache unit 126 may be set as a simple placeholder (for example, 1) to indicate whether a result corresponding to the first query exists in the cache unit or not.
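The placeholder scheme may be sketched as follows, with the offline cache unit modeled as a plain dictionary. The functions `build_offline_cache` and `offline_signal` are hypothetical names, and expiration handling is deliberately left out of this sketch.

```python
from typing import Dict, Iterable


def build_offline_cache(online_keys: Iterable[str]) -> Dict[str, int]:
    """Mirror the keys of the online cache, storing only the placeholder 1."""
    return {key: 1 for key in online_keys}


def offline_signal(offline_cache: Dict[str, int], key: str) -> int:
    """Return 1 (hit) if the key exists in the offline cache, else 0 (miss)."""
    return offline_cache.get(key, 0)
```

Since the returned value is already 1 or 0, it can be used directly as the hit or miss signal when computing the second hit rate.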

In some embodiments, the user query may be forwarded (for example, asynchronously) by the application 114 to the simulation module 124 for processing. In other embodiments, the simulation module 124 may also be configured with an interface for receiving the user query 102.

In some embodiments, if the search system 115 has sufficient resources to process the query request (for example, a resource utilization is lower than a predetermined value and/or the response time is less than a predetermined value, so that the implementation of the functions of the simulation module 124 and the offline cache unit 126 does not affect the user experience), the application 114 and the cache unit 116 may also be configured by the computing device to have the same functions as the simulation module 124 and the offline cache unit 126. In these embodiments, the simulation module 124 and the offline cache unit 126 may also be omitted.

All signals indicating whether the user query and the first query hit or not may be processed by the computing device 105 according to the solution described in the present disclosure (for example, at the key element determination module 104) so as to determine one or more key elements that have a great impact on the cache unit hit rate.

Based on the key elements determined, the computing device 105 may optimize the search system, for example, through the optimization module 106, so as to stabilize or even improve the hit rate of the cache unit, thereby reducing the computing overhead of the search system and improving the response speed for the user query. The optimization operation includes, but is not limited to, updating the key-value pairs stored in the cache unit, adjusting (for example, some or all) data expiration periods of the cache unit, adjusting the type and number of the elements associated with the user query (for example, when there is a new user query source or a new filtering condition), adjusting a data elimination strategy in the cache unit, adjusting the data stored in different levels of cache, and so on, according to the key elements determined.

It may be understood that the search system 115 and the computing device 105 in the exemplary environment 100 may process the user query in real time and determine the key elements that affect the hit rate.

For clarity, the embodiments of the present disclosure will be described below with reference to the environment 100 in FIG. 1. It should be understood that the embodiments of the present disclosure may further include additional actions not shown, and/or actions shown may be omitted. The scope of the present disclosure is not limited in this respect. For ease of understanding, the specific data mentioned in the following description are all exemplary and are not used to limit the protection scope of the present disclosure.

FIG. 2 shows a flowchart of a method 200 of optimizing a search system according to some embodiments of the present disclosure. For example, the method 200 may be implemented by the computing device 105 shown in FIG. 1.

In step 202, the computing device 105 may determine the first hit rate of the cache unit of the search system 115 for a plurality of user queries.

Specifically, as discussed above, because each user query in the plurality of user queries (for example, x user queries including user query 1 to user query x, where x is a positive integer greater than 1) may be associated with a plurality of query elements (for example, n elements including element 1 to element n, where n is a positive integer greater than 1), the first hit rate may be affected by the plurality of elements. In some embodiments, the plurality of elements may include but not be limited to: a type of a terminal providing the user query, location information for the terminal providing the user query, a key phrase contained in the user query, a time when the user query is conducted, a filtering condition contained in the user query, a number of pages of the query results corresponding to the user query, a number of query result entries contained in each page of the query results, traffic tag(s) associated with the user query, an indicator indicating whether the user query belongs to a stress testing, and so on. It may be understood that each user query may be associated with a time for conducting the user query.

The first hit rate may be defined as a ratio of a first number of queries having respective query results retrievable in the cache unit among a plurality of user queries to a total number of the plurality of user queries within a predetermined period of time (for example, 1 minute, 10 seconds, 10 minutes, or any other suitable time period). The method of determining the first number will be discussed in detail below with reference to FIG. 5.
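As a hedged illustration of this definition, the sketch below groups (timestamp, signal) pairs into fixed windows and computes the ratio of hits to total queries in each window. The function name `hit_rate_by_window`, the event format, and the 60-second default are assumptions of this sketch. Signals follow the convention described earlier (1 for a hit, 0 for a miss).

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def hit_rate_by_window(events: Iterable[Tuple[float, int]],
                       window_seconds: float = 60.0) -> Dict[int, float]:
    """Map each window index to hits / total queries within that window."""
    hits: Dict[int, int] = defaultdict(int)
    totals: Dict[int, int] = defaultdict(int)
    for timestamp, signal in events:
        window = int(timestamp // window_seconds)
        hits[window] += signal   # signal is 1 for a hit, 0 for a miss
        totals[window] += 1
    # Windows that received no queries are simply absent from the result.
    return {w: hits[w] / totals[w] for w in sorted(totals)}
```

The sequence of per-window values forms a hit rate series over time; the same function applies unchanged to signals from an offline cache unit.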

It may be understood that the first hit rate may be affected to different degrees by various elements, and at least one key element affecting the first hit rate may also change in a plurality of predetermined time periods. Therefore, for each element in a first set of elements in the plurality of elements, following steps (for example, step 204 to step 208) may be performed to determine at least one key element. In some embodiments, the first set of elements may include all of the elements, in other words, a complete set of the plurality of elements.

In step 204, the computing device 105 may generate a plurality of first queries corresponding to the plurality of user queries.

The first query may be associated with at least the element under consideration. Therefore, for example, for element 1, x first queries (for example, first query 1 to first query x, which are associated with the element 1) may be constructed, and the x first queries are only associated with the element 1. In some embodiments, the first query may also be associated with more elements in the first set of elements. For example, for element i, each of the plurality of first queries is associated with the element i and the element 1 to element i-1. It may be understood that the x first queries associated with each element listed above are only examples, in which the number of the elements included therein has an increasing relationship with the serial number of the elements. In some other examples, these first queries may also be generated in a decreasing manner. For example, for element i, x first queries associated with the element i only may be generated, and for element i-1, x first queries associated with both the element i and the element i-1 may be generated. A generation of x first queries in any other manner in which different elements may be distinguished is included in the scope of the present disclosure.

In step 206, the computing device 105 may determine the second hit rate of the cache unit for the plurality of first queries.

The second hit rate is a ratio of a second number of first queries having respective query results retrievable in the offline cache unit among the plurality of first queries to a total number of the plurality of first queries within a predetermined period of time. The method of determining the second number will be discussed in detail below with reference to FIG. 6.

As discussed above, in some embodiments, if the search system 115 has sufficient resources to process the query request (for example, a resource utilization of the search system 115 is lower than a predetermined value and/or the response time is less than a predetermined value, so that the implementation of the functions of the simulation module 124 and the offline cache unit 126 does not affect the user experience), the step 204 and the step 206 may be executed through instructing, by the computing device, the application 114 and the cache unit 116 in the search system and acquiring a plurality of corresponding signals from the search system 115 for analysis.

In other embodiments, the step 204 and the step 206 may be implemented entirely at the computing device 105 configured with, for example, the corresponding simulation module 124 and offline cache unit 126. Therefore, the offline cache unit 126 corresponding to the cache unit may be pre-constructed, and the computing device 105 may transmit the plurality of first queries to the offline cache unit and determine the hit rate of the offline cache unit for the plurality of first queries as the second hit rate.

In some embodiments, the computing device 105 may generate the plurality of first queries asynchronously with processing the plurality of user queries by the search system 115. For example, the search system 115 may first process the user query to return query results, and then transmit the user query to the simulation module so as to generate the first queries asynchronously. Each first query corresponds to an element. Then, the simulation module may determine the corresponding key for the first query, for example, through the signature algorithm, and then determine whether there is a corresponding key-value pair in the offline cache unit, and return the value. It may be understood that when there is no corresponding key-value pair, the returned value may be empty (for example, 0 or null).

In step 208, the computing device may determine whether a difference between the second hit rate and the first hit rate is less than a difference threshold or not. If the difference between the second hit rate and the first hit rate is less than the difference threshold, step 210 is performed and the computing device 105 determines the element as one of the at least one key element. On the contrary, if the difference between the second hit rate and the first hit rate is not less than the difference threshold, the computing device 105 does not determine the element as one of the at least one key element.
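The decision in step 208 and step 210 may be sketched as below for scalar hit rates. The sketch assumes the difference is taken as an absolute value, which the description does not state explicitly. The function `find_key_elements` and the element names used in the usage example are hypothetical.

```python
from typing import Dict, List


def find_key_elements(first_hit_rate: float,
                      second_hit_rates: Dict[str, float],
                      difference_threshold: float) -> List[str]:
    """Return the elements whose second hit rate is close to the first hit rate."""
    key_elements = []
    for element, second_hit_rate in second_hit_rates.items():
        # Assumption: the difference is measured as an absolute value.
        difference = abs(second_hit_rate - first_hit_rate)
        if difference < difference_threshold:
            key_elements.append(element)
    return key_elements


# Usage: only "location" tracks the overall hit rate closely enough.
print(find_key_elements(0.60, {"location": 0.58, "terminal": 0.90}, 0.05))
```

An element whose single-element hit rate closely follows the real hit rate is thereby treated as one that largely explains the behavior of the cache.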

In some embodiments, the difference threshold may be selected as a fixed value so that all elements corresponding to a small difference are determined as the key elements. In some embodiments, at different time periods, the difference threshold (for example, to be used in a second period later than the first period) may also be adjusted according to the number of the at least one key element determined (for example, in the first period).

In some embodiments, the above difference may be determined by determining, for example, a similarity between the hit rates (for example, curves), which will be described in detail below with reference to FIG. 3, FIG. 4 and FIG. 7. It may be understood that the step 204 to the step 208 may be executed on all of the plurality of elements (for example, n elements including element 1 to element n, where n is a positive integer greater than 1). The execution process may be synchronous or asynchronous, and the present disclosure is not limited in this respect.

In some embodiments, the first set of elements for which the second hit rate is to be determined may also include just some elements of the plurality of elements, so as to exclude elements that are stable in some time periods (for example, the type of the terminal on which the user query is conducted), so that the processing efficiency of the computing device 105 may be improved. In these embodiments, the first set of elements may be selected with reference to the historical difference between the first hit rate and the second hit rate.

In step 212, the computing device 105 may optimize the search system 115 based on the at least one key element.

Based on the at least one key element determined, the computing device 105 may optimize the search system 115, for example, through the optimization module 106, so as to stabilize or even improve the hit rate of the cache unit, thereby reducing the computing overhead of the search system 115 and improving the response speed for the user query. The optimization operation includes, but is not limited to, updating the key-value pairs stored in the cache unit, adjusting (for example, some or all) data expiration periods of the cache unit, adjusting the type and number of the elements associated with the user query (for example, when there is a new user query source or a new filter condition), adjusting a data elimination strategy in the cache unit, adjusting the data stored in different levels of cache, and so on, according to the key elements determined.

According to embodiments of the present disclosure, the key elements affecting the search system may be determined efficiently, and the performance of the search system may be optimized accordingly.

In some embodiments, in a case that a large number of key elements are determined, the computing device 105 may also select a predetermined number of (for example, any of 1 to 5) key elements from the at least one key element, for optimization of the search system. In this case, the computing device 105 may sort the key elements by the difference between the second hit rate and the first hit rate (for example, by the similarity described below), and select the predetermined number of key elements with the smallest difference (for example, the highest similarity) for optimization of the search system.
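
The selection of a predetermined number of key elements may be sketched as follows. The function name, the dictionary layout and the element names are hypothetical and serve only to illustrate sorting by the smallest difference.

```python
def select_key_elements(differences, max_count):
    """Keep the max_count elements whose simulated hit rate differs least.

    differences: mapping from an element name to the difference between the
    second hit rate and the first hit rate (smaller means more similar).
    """
    ranked = sorted(differences.items(), key=lambda item: item[1])
    return [name for name, _ in ranked[:max_count]]


# Hypothetical per-element differences for one period.
differences = {"query_text": 0.01, "terminal_type": 0.20, "page_size": 0.05}
print(select_key_elements(differences, 2))
```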

In this way, the change in the hit rate may be analyzed at the level of the elements forming the user query, so as to determine one or more key elements that have a great impact on the cache hit rate in a case of significant changes in the cache hit rate, and then optimize the search system based on the key elements determined, so as to improve the performance of the search system in various aspects such as the hit rate of the cache, the hit rate of the search system, the response time of the search system, the computing overhead of the search system, and so on.

FIG. 3 shows a flowchart of a method 300 of determining at least one key element according to some embodiments of the present disclosure. Specifically, the method 300 may be a specific process of the step 208 in FIG. 2. The method 300 is a process for one element of the plurality of elements, and it may be understood that the process may be performed on all of the plurality of elements.

In step 302, the computing device 105 may determine the similarity between the first hit rate and the second hit rate.

The similarity may be determined in a variety of ways, for example, calculated by using the cosine of the included angle (cosine similarity), the Euclidean distance, the Pearson correlation coefficient, and so on.
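
The three measures named above may be sketched in plain Python for two hit rate series sampled over the same periods. The function names and the handling of degenerate inputs (returning 0.0 for a zero-norm or constant series) are assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equally long hit rate series."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: undefined case is reported as no similarity
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between the series; 0 means identical."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pearson_correlation(a, b):
    """Linear correlation of the series, in the range [-1, 1]."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    covariance = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # assumption: a constant series is reported as uncorrelated
    return covariance / math.sqrt(var_a * var_b)
```

Note that a larger cosine similarity or Pearson coefficient indicates a closer match, whereas a smaller Euclidean distance does, so a threshold comparison has to be oriented accordingly.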

In some embodiments, the computing device 105 may draw a hit rate curve indicative of each element of the n elements in a plurality of time periods, and evaluate the similarity based on the hit rate curve.

Specifically, in the search system 115, some elements of the plurality of elements in the user query are stable, and a distribution of these elements in the time series changes little, so these elements have little impact on the cache hit rate. Other elements, such as a user query text (for example, a query text related to breaking news events), have an impact on the cache hit rate that cannot be ignored. In different periods, these key elements may be different. Therefore, at least one key element of the plurality of elements in different time periods may be determined by drawing the hit rate curve in different time periods, and then the search system and/or the cache unit may be adjusted based on the key element.

A description is now given with reference to FIG. 4. FIG. 4 shows a schematic diagram 400 of hit rate curves according to some embodiments of the present disclosure. The computing device 105 may draw a first curve 402 for the first hit rate, and second curve(s) 404, 406 for the second hit rate. As an example, FIG. 4 only shows the cache hit rate curve 402 for the real user query, and the simulated hit rate curves 404 and 406. The simulated hit rate curve 406 may, for example, indicate the second hit rate for the first query associated with at least the element 1, and the simulated hit rate curve 404 may, for example, indicate the second hit rate for the first query associated with at least the element 1 and the element 2. It may be understood that the computing device 105 may also draw other curves (not shown) to determine the key elements.

For example, the computing device 105 may determine a total number of the user queries in a period from t1 to t2, and determine a first number of the user queries corresponding to query results retrievable in the cache system in the period from t1 to t2 (the method of determining the first number will be discussed below in detail with reference to FIG. 5), and determine a ratio of the first number to the total number of the user queries so as to determine a hit rate 402-1. In a similar manner, the computing device 105 may also determine a hit rate 402-2 in a period from t2 to t3, a hit rate 402-3 in a period from t3 to t4, a hit rate 402-4 in a period from t4 to t5, a hit rate 402-5 in a period from t5 to t6, and so on. Based on these hit rate points, the computing device 105 may draw a first curve 402 for the first hit rate.
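
The per-period ratios that form a hit rate curve may be computed as in the following sketch. The event layout (a timestamp paired with a hit flag), the half-open periods [t_i, t_{i+1}) and the use of None for a period without queries are assumptions made for illustration.

```python
import bisect


def hit_rate_curve(events, boundaries):
    """Compute one hit rate point per period.

    events: iterable of (timestamp, hit) pairs, where hit is True when the
        query result was retrievable from the cache.
    boundaries: sorted period boundaries [t1, t2, ..., tk]; period i covers
        the half-open interval [boundaries[i], boundaries[i + 1]).
    """
    period_count = len(boundaries) - 1
    totals = [0] * period_count
    hits = [0] * period_count
    for timestamp, hit in events:
        index = bisect.bisect_right(boundaries, timestamp) - 1
        if 0 <= index < period_count:  # ignore events outside all periods
            totals[index] += 1
            if hit:
                hits[index] += 1
    # A period without queries has no defined hit rate.
    return [h / t if t else None for h, t in zip(hits, totals)]
```

The same function can be applied to the user queries to obtain the first curve and to each group of first queries to obtain the corresponding second curve.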

For the first query associated with each query element (such as element 1), the computing device 105 may further determine a total number of the first queries in the period from t1 to t2, and determine a second number of the first queries corresponding to query results retrievable in the cache system in the period from t1 to t2 (the method of determining the second number will be discussed below in detail with reference to FIG. 6), and determine a ratio of the second number to the total number of the first queries so as to determine a hit rate 404-1. In a similar manner, the computing device 105 may also determine a hit rate 404-2 in a period from t2 to t3, a hit rate 404-3 in a period from t3 to t4, a hit rate 404-4 in a period from t4 to t5, a hit rate 404-5 in a period from t5 to t6, and so on. Based on these hit rate points, the computing device 105 may draw a second curve 404 of the second hit rate for the corresponding first queries.

Then, the computing device 105 may determine the similarity based on a proximity of the first curve 402 to the second curves (404 and 406, for example). This may be achieved, for example, by calculating a degree of the proximity of the curves. In some embodiments, the computing device 105 may score according to a predetermined number of period pairs, so as to determine the key elements in the period.

Referring back to FIG. 3, in step 304, the computing device 105 may determine whether the similarity is higher than a similarity threshold.

If the similarity is higher than the similarity threshold, step 306 is performed and the computing device 105 determines the element as one of the at least one key element. If the similarity is not higher than the similarity threshold, the computing device 105 does not determine the element as one of the at least one key element.

Specifically, the similarity threshold may be selected as a fixed value so that all elements corresponding to a high similarity are determined as the key elements. In some embodiments, the similarity threshold may also be adjusted across time periods: for example, the threshold to be used in a second period later than a first period may be adjusted according to the number of key elements determined in the first period.
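
One possible adjustment rule is sketched below. The disclosure does not specify how the threshold is adjusted, so the function name, the target count, the step size and the clamping to the range [0, 1] are all assumptions made for illustration.

```python
def adjust_similarity_threshold(threshold, key_element_count,
                                target_count=3, step=0.02):
    """Derive the threshold for the next period from the previous result.

    Assumed policy: tighten the threshold when too many key elements were
    found in the previous period, and loosen it when none were found.
    """
    if key_element_count > target_count:
        threshold += step
    elif key_element_count == 0:
        threshold -= step
    # Keep the threshold inside the assumed valid range of the similarity.
    return min(1.0, max(0.0, threshold))
```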

In some embodiments, the computing device may simultaneously draw the first curve and the plurality of second curves as shown in FIG. 7. FIG. 7 shows another schematic diagram 700 of hit rate curves according to some embodiments of the present disclosure. In FIG. 7, a curve 702 (that is, a first curve) indicates a first hit rate, and a plurality of curves 704, 706 and 708 (that is, second curves) respectively indicate a cache hit rate of the cache unit for a first query associated only with element 1 (for example, a key phrase contained in the user query), a cache hit rate for the first query associated with the element 1 and the element 2 (for example, the number of results per page and/or the number of pages), and a cache hit rate for the first query associated with element 1, element 2 and element 3 (for example, elements associated with stress testing). It may be understood that the schematic diagram 700 may further include more second curves, so that the elements may be distinguished individually. As intuitively shown, in the period in which the cache hit rate changes significantly (around 66000 s), the change trend of the curve 702 for the first hit rate is most similar to that of the curve 708, but not very similar to that of the curve 704 or the curve 706. Therefore, the computing device may determine that the element 3 is the key element.

In some embodiments, the computing device may further differentiate the first curve and the plurality of second curves, and compare the derivative value of the first curve (a first derivative value) with the derivative values of the second curves (a plurality of second derivative values) at different times (or periods). The computing device may determine at least one second derivative value of the plurality of second derivative values that differs little from the first derivative value, and determine at least one corresponding second curve (and therefore at least one key element) accordingly.
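
For sampled hit rate curves, the derivative comparison may be approximated with finite differences, as in the following sketch. The function names, the curve labels and the use of a summed absolute slope gap are assumptions made for illustration.

```python
def finite_differences(curve):
    """Approximate the derivative of a sampled curve by successive differences."""
    return [later - earlier for earlier, later in zip(curve, curve[1:])]


def closest_curve_by_slope(first_curve, second_curves):
    """Return the label of the second curve whose slopes best match the first.

    second_curves: mapping from a curve label to its sampled hit rates, with
    the same sampling periods as first_curve.
    """
    first_slopes = finite_differences(first_curve)

    def slope_gap(label):
        slopes = finite_differences(second_curves[label])
        return sum(abs(a - b) for a, b in zip(first_slopes, slopes))

    return min(second_curves, key=slope_gap)


# Hypothetical samples: the real hit rate drops sharply in the third period.
first = [0.80, 0.80, 0.50, 0.70]
seconds = {
    "element_1": [0.90, 0.90, 0.90, 0.90],
    "element_3": [0.85, 0.85, 0.55, 0.75],
}
print(closest_curve_by_slope(first, seconds))
```

Comparing slopes rather than raw values makes a second curve that is offset from the first curve but moves with it count as a close match.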

In this way, the change in the hit rate may be analyzed simply and intuitively at the level of the elements forming the user query by determining the similarity of the hit rate based on the hit rate curves, so as to efficiently and accurately determine one or more key elements that have a great impact on the cache hit rate.

FIG. 5 shows a flowchart of a method 500 of determining a number of the user queries hit in a cache unit according to some embodiments of the present disclosure.

As discussed above, in some embodiments, the cache unit 116 (or the offline cache unit 126) may contain (for example, store) a plurality of first key-value pairs (or a plurality of second key-value pairs). In some embodiments, the value in the first key-value pair is the query result. Therefore, for each user query of the plurality of user queries, the following steps may be performed to determine the first number of the user queries that may be hit in the cache unit.

In step 502, the computing device 105 may generate a first key for each user query based on a plurality of elements contained in that user query, by using a signature algorithm.

Specifically, since the user query is associated with the plurality of elements, the plurality of elements may be stitched together (for example, in a predetermined order), and the signature algorithm may be applied to the stitched plurality of elements, so as to determine the key. For example, when the signature algorithm is the MD5 algorithm, the first key of MD5 type may be calculated.
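
The key generation may be sketched with the MD5 implementation in Python's standard hashlib module. The element names, their order and the separator character are hypothetical; the disclosure only requires that the elements be stitched in a predetermined order before the signature algorithm is applied.

```python
import hashlib

# Hypothetical element names, listed in the predetermined stitching order.
ELEMENT_ORDER = ("terminal_type", "location", "key_phrase", "page_number")


def make_key(query, element_order=ELEMENT_ORDER, separator="\x1f"):
    """Stitch the query elements in a fixed order and sign them with MD5.

    query: mapping from an element name to its value; missing elements are
    treated as empty strings (an assumption of this sketch).
    """
    stitched = separator.join(str(query.get(name, "")) for name in element_order)
    return hashlib.md5(stitched.encode("utf-8")).hexdigest()
```

Because the elements are read in the fixed order rather than in the order in which they appear in the query, two queries with the same element values always map to the same key.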

In step 504, the computing device 105 may search in a plurality of first key-value pairs based on the first key so as to determine the first query result.

For example, the computing device 105 may acquire, from a plurality of first key-value pairs in the cache unit, a key-value pair with a key that is the same as or corresponds to the first key, and return the value. If no key-value pair with a key that is the same as or corresponds to the first key is found, a null value is returned.

The computing device 105 may determine the first number by counting the first query results that are not null. For example, in step 506, the computing device 105 may determine whether the first query result is non-null. If so, step 508 is performed so that a count value M of the first number is incremented by 1. If not, step 510 is performed so that the count value M of the first number remains unchanged. In this way, the number of the user queries hit in the cache unit in a predetermined period may be calculated. After the predetermined period, the count value of the first number may be zeroed so as to restart a count for a next predetermined period.

It may be understood that the first number may also be determined by other counting methods. For example, the first query results that are not null in the predetermined period may be temporarily stored, and counted at an end of the predetermined period.
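
The lookup and counting of steps 504 to 510 may be sketched as follows, with the cache unit modeled as an in-memory dictionary. The function name and the dictionary model are assumptions made for illustration; a real cache unit would typically be a separate service.

```python
def count_hits(keys, cache):
    """Count the lookups whose query result is not null.

    keys: the first keys generated for the user queries of one period.
    cache: mapping from a key to a cached query result.
    """
    count = 0  # reset at the start of every predetermined period
    for key in keys:
        result = cache.get(key)  # None plays the role of the null value
        if result is not None:
            count += 1  # the query was hit in the cache unit
    return count


# Hypothetical cache content and keys for one period.
cache = {"k1": "results for k1", "k2": "results for k2"}
print(count_hits(["k1", "k3", "k2", "k1"], cache))
```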

FIG. 6 shows a flowchart of a method 600 of determining a number of the first queries hit in the cache unit according to some embodiments of the present disclosure.

As discussed above, the cache unit 116 may contain (for example, store) a plurality of first key-value pairs, and the offline cache unit 126 may contain (for example, store) a plurality of second key-value pairs that correspond to the plurality of first key-value pairs. In some embodiments, the value in the second key-value pair is a placeholder. Therefore, for each first query of the plurality of first queries, the following steps may be performed to determine the second number of the first queries that may be hit in the offline cache unit (or the cache unit in some embodiments).

In step 602, the computing device 105 may generate a second key for each first query based on a plurality of elements contained in that first query, by using a signature algorithm.

Specifically, the first query may be associated with at least one element, and the at least one element is a subset of the first set of elements. Therefore, the computing device 105 may apply the signature algorithm to the at least one element so as to determine a key. For example, when the signature algorithm is the MD5 algorithm, the second key of MD5 type may be calculated.

In step 604, the computing device 105 may search in a plurality of second key-value pairs based on the second key so as to determine the second query result.

For example, the computing device 105 may acquire, from a plurality of second key-value pairs in the offline cache unit, a key-value pair with a key that is the same as or corresponds to the second key, and return the value. If no key-value pair with a key that is the same as or corresponds to the second key is found, a null value is returned.

The computing device 105 may determine the second number by counting the second query results that are not null. For example, in step 606, the computing device 105 may determine whether the second query result is non-null. If so, step 608 is performed so that a count value P of the second number is incremented by 1. If not, step 610 is performed so that the count value P of the second number remains unchanged. In this way, the number of the first queries hit in the offline cache unit (or the cache unit in some embodiments) in a predetermined period may be counted. After the predetermined period, the count value of the second number may be zeroed so as to restart a count for a next predetermined period.

It may be understood that the second number may also be determined by other counting methods. For example, the second query results that are not null in the predetermined period may be temporarily stored, and counted at an end of the predetermined period.
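
The method 600 may be sketched as a small simulation in which first queries are formed from a subset of the elements and looked up in an offline cache that stores placeholders only. The function names, the placeholder value, the element names and in particular the populate-on-miss policy without expiration are assumptions of this sketch; the disclosure does not specify how the offline cache unit is filled.

```python
import hashlib

PLACEHOLDER = 1  # assumed placeholder value; no query result is stored


def subset_key(query, element_names):
    """Second key: MD5 signature over the selected elements only."""
    stitched = "\x1f".join(f"{name}={query.get(name, '')}" for name in element_names)
    return hashlib.md5(stitched.encode("utf-8")).hexdigest()


def simulate_second_hits(user_queries, element_names):
    """Count the first queries hit in a simulated offline cache.

    Assumed policy: a miss stores a placeholder under the second key, so a
    later first query with the same selected elements becomes a hit.
    """
    offline_cache = {}
    hits = 0
    for query in user_queries:
        key = subset_key(query, element_names)
        if offline_cache.get(key) is not None:
            hits += 1
        else:
            offline_cache[key] = PLACEHOLDER
    return hits


# Hypothetical user queries that differ in key phrase and terminal type.
queries = [
    {"key_phrase": "news", "terminal_type": "mobile"},
    {"key_phrase": "news", "terminal_type": "desktop"},
    {"key_phrase": "weather", "terminal_type": "mobile"},
]
print(simulate_second_hits(queries, ["key_phrase"]))
print(simulate_second_hits(queries, ["key_phrase", "terminal_type"]))
```

Storing a placeholder instead of the query result keeps the offline cache small, since only the presence of a key matters for the simulated hit rate.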

FIG. 8 shows a block diagram of an apparatus 800 of optimizing a search system according to some embodiments of the present disclosure.

The apparatus 800 may include a first hit rate determination module 802 configured to determine a first hit rate of a cache unit of the search system for a plurality of user queries. Each user query is associated with a plurality of elements. The apparatus 800 may further include a key element determination module 804 configured to, for each element in a first set of elements of the plurality of elements, determine at least one key element by: generating a plurality of first queries that correspond to the plurality of user queries and that are associated with at least the element; determining a second hit rate of the cache unit for the plurality of first queries; and determining the element as one of at least one key element, in response to determining that a difference between the second hit rate and the first hit rate is less than a difference threshold. The apparatus 800 may further include an optimization module 806 configured to optimize the search system based on the at least one key element.

In some embodiments, the offline cache unit corresponding to the cache unit is pre-constructed. The key element determination module 804 is further configured to: transmit the plurality of first queries to the offline cache unit; and determine a hit rate of the offline cache unit for the plurality of first queries as the second hit rate.

In some embodiments, the key element determination module 804 is further configured to generate the plurality of first queries asynchronously with processing the plurality of user queries by the search system.

In some embodiments, the key element determination module 804 further includes a similarity determination module configured to determine the similarity between the first hit rate and the second hit rate. The key element determination module 804 is further configured to determine the element as one of the at least one key element in response to determining that the similarity is higher than the similarity threshold.

In some embodiments, the similarity determination module is further configured to: draw the first curve for the first hit rate and the second curve for the second hit rate; and determine the similarity based on a degree of proximity between the first curve and the second curve.

In some embodiments, the first hit rate is a ratio of a first number of the queries having respective query results retrievable in the cache unit among the plurality of user queries to a total number of the plurality of user queries within a predetermined period of time, and the second hit rate is a ratio of a second number of the first queries having respective query results retrievable in the cache unit among the plurality of first queries to a total number of the plurality of first queries within a predetermined period of time.

In some embodiments, the cache unit contains a plurality of first key-value pairs, and the first hit rate determination module 802 includes a first number determination module configured to: for each user query of the plurality of user queries, generate a first key for the user query according to a plurality of elements contained in that user query by using a signature algorithm; search in the plurality of first key-value pairs based on the first key, so as to determine a first query result; and determine the first number by counting the first query results that are not null.

In some embodiments, the offline cache unit contains a plurality of second key-value pairs corresponding to the plurality of first key-value pairs, and the key element determination module 804 includes a second number determination module configured to: for each first query of a plurality of first queries, generate a second key for the first query according to a plurality of elements contained in that first query by using a signature algorithm; search in the plurality of second key-value pairs based on the second key, so as to determine a second query result; and determine the second number by counting the second query results that are not null.

In some embodiments, the value in the first key-value pair is the query result, and the value in the second key-value pair is a placeholder.

In some embodiments, the plurality of elements may include, but are not limited to: a type of a terminal providing the user query, location information for the terminal providing the user query, a key phrase contained in the user query, a filtering condition contained in the user query, a number of the pages containing query results corresponding to the user query, a number of the query result entries contained in each of the pages containing query results corresponding to the user query, traffic tag(s) associated with the user query, a tag indicating whether the user query belongs to a stress test, and so on.

The collection, storage, use, processing, transmission, provision, and disclosure of the personal information of the user involved in the present disclosure all comply with the relevant laws and regulations, and do not violate public order and good morals.

According to the embodiments of the present disclosure, the present disclosure further provides an electronic device, a readable storage medium, and a computer program product.

FIG. 9 shows a schematic block diagram of an exemplary electronic device 900 for implementing the embodiments of the present disclosure. The electronic device is intended to represent various forms of digital computers, such as a laptop computer, a desktop computer, a workstation, a personal digital assistant, a server, a blade server, a mainframe computer, and other suitable computers. The electronic device may further represent various forms of mobile devices, such as a personal digital assistant, a cellular phone, a smart phone, a wearable device, and other similar computing devices. The components as illustrated herein, and connections, relationships, and functions thereof are merely examples, and are not intended to limit the implementation of the present disclosure described and/or required herein.

As shown in FIG. 9, the electronic device 900 may include a computing unit 901, which may perform various appropriate actions and processing based on a computer program stored in a read-only memory (ROM) 902 or a computer program loaded from a storage unit 908 into a random access memory (RAM) 903. Various programs and data required for the operation of the electronic device 900 may be stored in the RAM 903. The computing unit 901, the ROM 902 and the RAM 903 are connected to each other through a bus 904. An input/output (I/O) interface 905 is also connected to the bus 904.

Various components in the electronic device 900, including an input unit 906 such as a keyboard, a mouse, etc., an output unit 907 such as various types of displays, speakers, etc., a storage unit 908 such as a magnetic disk, an optical disk, etc., and a communication unit 909 such as a network card, a modem, a wireless communication transceiver, etc., are connected to the I/O interface 905. The communication unit 909 allows the electronic device 900 to exchange information/data with other devices through a computer network such as the Internet and/or various telecommunication networks.

The computing unit 901 may be various general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 901 include but are not limited to a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various computing units running machine learning model algorithms, a digital signal processor (DSP), and any appropriate processor, controller, microcontroller, and so on. The computing unit 901 may perform the various methods and processes described above, such as the method 200, the method 300, the method 500 and the method 600. For example, in some embodiments, any of the method 200, the method 300, the method 500 and the method 600 may be implemented as a computer software program that is tangibly contained on a machine-readable medium, such as the storage unit 908. In some embodiments, part or all of a computer program may be loaded and/or installed on the electronic device 900 via the ROM 902 and/or the communication unit 909. When a computer program is loaded into the RAM 903 and executed by the computing unit 901, one or more steps in any of the method 200, the method 300, the method 500 and the method 600 described above may be executed. Alternatively, in other embodiments, the computing unit 901 may be configured to perform any of the method 200, the method 300, the method 500 and the method 600 in any other appropriate way (for example, by means of firmware).

Various embodiments of the systems and technologies described herein may be implemented in a digital electronic circuit system, an integrated circuit system, a field programmable gate array (FPGA), an application specific integrated circuit (ASIC), an application specific standard product (ASSP), a system on chip (SOC), a complex programmable logic device (CPLD), a computer hardware, firmware, software, and/or combinations thereof. These various embodiments may be implemented by one or more computer programs executable and/or interpretable on a programmable system including at least one programmable processor. The programmable processor may be a dedicated or general-purpose programmable processor, which may receive data and instructions from the storage system, the at least one input device and the at least one output device, and may transmit the data and instructions to the storage system, the at least one input device, and the at least one output device.

Program codes for implementing the method of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or a controller of a general-purpose computer, a special-purpose computer, or other programmable data processing devices, so that when the program codes are executed by the processor or the controller, the functions/operations specified in the flowchart and/or block diagram may be implemented. The program codes may be executed completely on the machine, partly on the machine, partly on the machine and partly on the remote machine as an independent software package, or completely on the remote machine or the server.

In the context of the present disclosure, the machine readable medium may be a tangible medium that may contain or store programs for use by or in combination with an instruction execution system, device or apparatus. The machine readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine readable medium may include, but not be limited to, electronic, magnetic, optical, electromagnetic, infrared or semiconductor systems, devices or apparatuses, or any suitable combination of the above. More specific examples of the machine readable storage medium may include electrical connections based on one or more wires, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination of the above.

In order to provide interaction with users, the systems and techniques described here may be implemented on a computer including a display device (for example, a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user, and a keyboard and a pointing device (for example, a mouse or a trackball) through which the user may provide the input to the computer. Other types of devices may also be used to provide interaction with users. For example, a feedback provided to the user may be any form of sensory feedback (for example, visual feedback, auditory feedback, or tactile feedback), and the input from the user may be received in any form (including acoustic input, voice input or tactile input).

The systems and technologies described herein may be implemented in a computing system including back-end components (for example, a data server), or a computing system including middleware components (for example, an application server), or a computing system including front-end components (for example, a user computer having a graphical user interface or web browser through which the user may interact with the implementation of the system and technology described herein), or a computing system including any combination of such back-end components, middleware components or front-end components. The components of the system may be connected to each other by digital data communication (for example, a communication network) in any form or through any medium. Examples of the communication network include a local area network (LAN), a wide area network (WAN), and the Internet.

The computer system may include a client and a server. The client and the server are generally far away from each other and usually interact through a communication network. The relationship between the client and the server is generated through computer programs running on the corresponding computers and having a client-server relationship with each other. The server may be a cloud server, also known as a cloud computing server or a cloud host. It is a host product in the cloud computing service system to solve shortcomings of difficult management and weak business scalability existing in the traditional physical host and VPS (Virtual Private Server) service. The server may also be a server of a distributed system or a server combined with a blockchain.

It should be understood that steps of the processes illustrated above may be reordered, added or deleted in various manners. For example, the steps described in the present disclosure may be performed in parallel, sequentially, or in a different order, as long as a desired result of the technical solution of the present disclosure may be achieved. This is not limited in the present disclosure.

The above-mentioned specific embodiments do not constitute a limitation on the scope of protection of the present disclosure. Those skilled in the art should understand that various modifications, combinations, sub-combinations and substitutions may be made according to design requirements and other factors. Any modifications, equivalent replacements and improvements made within the spirit and principles of the present disclosure shall be contained in the scope of protection of the present disclosure.

What is claimed is:
 1. A method of optimizing a search system,comprising: determining a first hit rate of a cache unit of the searchsystem for a plurality of user queries, wherein each user query isassociated with a plurality of elements; for each element in a first setof elements of the plurality of elements, determining at least one keyelement by: generating a plurality of first queries corresponding to theplurality of user queries, wherein the plurality of first queries areassociated with at least the element; determining a second hit rate ofthe cache unit for the plurality of first queries; and determining theelement as one of at least one key element, in response to determiningthat a difference between the second hit rate and the first hit rate isless than a difference threshold; and optimizing the search system basedon the at least one key element.
 2. The method of claim 1, wherein anoffline cache unit corresponding to the cache unit is pre-constructed,and wherein the determining a second hit rate of the cache unit for theplurality of first queries comprises: transmitting the plurality offirst queries to the offline cache unit; and determining a hit rate ofthe offline cache unit for the plurality of first queries as the secondhit rate.
 3. The method of claim 1, wherein the generating the pluralityof first queries comprises: generating the plurality of first queriesasynchronously with processing the plurality of user queries by thesearch system.
 4. The method of claim 1, wherein the determining the element as one of the at least one key element comprises: determining a similarity between the first hit rate and the second hit rate; and determining the element as one of the at least one key element in response to determining that the similarity is greater than a similarity threshold.
 5. The method of claim 4, wherein the determining a similarity comprises: drawing a first curve for the first hit rate and a second curve for the second hit rate; and determining the similarity based on a degree of proximity between the first curve and the second curve.
 6. The method of claim 2, wherein the first hit rate is a ratio of a first number of user queries having respective query results retrievable in the cache unit among the plurality of user queries to a total number of the plurality of user queries within a predetermined period of time; and wherein the second hit rate is a ratio of a second number of first queries having respective query results retrievable in the offline cache unit among the plurality of first queries to a total number of the plurality of first queries within the predetermined period of time.
 7. The method of claim 6, wherein the cache unit contains a plurality of first key-value pairs, and the determining the first number comprises: for each user query of the plurality of user queries: generating a first key for the user query according to the plurality of elements contained in the user query, by using a signature algorithm; and searching in the plurality of first key-value pairs based on the first key, so as to determine a first query result; and determining the first number by counting the first query results that are not null.
 8. The method of claim 7, wherein the offline cache unit contains a plurality of second key-value pairs corresponding to the first key-value pairs, and the determining the second number comprises: for each first query of the plurality of first queries: generating a second key for the first query according to the elements contained in the first query, by using a signature algorithm; and searching in the plurality of second key-value pairs based on the second key, so as to determine a second query result; and determining the second number by counting the second query results that are not null.
 9. The method of claim 8, wherein a value in each first key-value pair is a query result, and a value in each second key-value pair is a placeholder.
 10. The method of claim 1, wherein the plurality of elements comprise at least one of a type of a terminal providing the user query, location information for the terminal providing the user query, a key phrase contained in the user query, a number of pages containing query results corresponding to the user query, a number of query result entries contained in each of the pages containing query results corresponding to the user query, a traffic tag associated with the user query, and a tag indicating whether the user query is a stress test query.
 11. The method of claim 2, wherein the plurality of elements comprise at least one of a type of a terminal providing the user query, location information for the terminal providing the user query, a key phrase contained in the user query, a number of pages containing query results corresponding to the user query, a number of query result entries contained in each of the pages containing query results corresponding to the user query, a traffic tag associated with the user query, and a tag indicating whether the user query is a stress test query.
 12. The method of claim 3, wherein the plurality of elements comprise at least one of a type of a terminal providing the user query, location information for the terminal providing the user query, a key phrase contained in the user query, a number of pages containing query results corresponding to the user query, a number of query result entries contained in each of the pages containing query results corresponding to the user query, a traffic tag associated with the user query, and a tag indicating whether the user query is a stress test query.
 13. The method of claim 4, wherein the plurality of elements comprise at least one of a type of a terminal providing the user query, location information for the terminal providing the user query, a key phrase contained in the user query, a number of pages containing query results corresponding to the user query, a number of query result entries contained in each of the pages containing query results corresponding to the user query, a traffic tag associated with the user query, and a tag indicating whether the user query is a stress test query.
 14. The method of claim 5, wherein the plurality of elements comprise at least one of a type of a terminal providing the user query, location information for the terminal providing the user query, a key phrase contained in the user query, a number of pages containing query results corresponding to the user query, a number of query result entries contained in each of the pages containing query results corresponding to the user query, a traffic tag associated with the user query, and a tag indicating whether the user query is a stress test query.
 15. The method of claim 6, wherein the plurality of elements comprise at least one of a type of a terminal providing the user query, location information for the terminal providing the user query, a key phrase contained in the user query, a number of pages containing query results corresponding to the user query, a number of query result entries contained in each of the pages containing query results corresponding to the user query, a traffic tag associated with the user query, and a tag indicating whether the user query is a stress test query.
 16. The method of claim 7, wherein the plurality of elements comprise at least one of a type of a terminal providing the user query, location information for the terminal providing the user query, a key phrase contained in the user query, a number of pages containing query results corresponding to the user query, a number of query result entries contained in each of the pages containing query results corresponding to the user query, a traffic tag associated with the user query, and a tag indicating whether the user query is a stress test query.
 17. The method of claim 8, wherein the plurality of elements comprise at least one of a type of a terminal providing the user query, location information for the terminal providing the user query, a key phrase contained in the user query, a number of pages containing query results corresponding to the user query, a number of query result entries contained in each of the pages containing query results corresponding to the user query, a traffic tag associated with the user query, and a tag indicating whether the user query is a stress test query.
 18. The method of claim 9, wherein the plurality of elements comprise at least one of a type of a terminal providing the user query, location information for the terminal providing the user query, a key phrase contained in the user query, a number of pages containing query results corresponding to the user query, a number of query result entries contained in each of the pages containing query results corresponding to the user query, a traffic tag associated with the user query, and a tag indicating whether the user query is a stress test query.
 19. An electronic device, comprising: at least one processor; and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, cause the at least one processor to implement the method of claim 1.
 20. A non-transitory computer-readable storage medium having computer instructions stored thereon, wherein the computer instructions allow a computer to implement the method of claim 1.
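
For illustration only, the hit-rate comparison recited in the claims above may be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of the present disclosure: the names signature, hit_rate and find_key_elements are hypothetical, SHA-256 stands in for the unspecified signature algorithm, each first query is assumed to carry only the element under test, and both caches are assumed to be filled on a miss while the queries are replayed in order.

```python
import hashlib
import json

PLACEHOLDER = True  # assumed placeholder value stored in the offline cache


def signature(elements: dict) -> str:
    """Derive a cache key from the elements of a query.

    SHA-256 over a canonical JSON form is an assumed stand-in for the
    signature algorithm, which the claims do not specify.
    """
    canonical = json.dumps(elements, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def hit_rate(queries, cache, value_for_miss) -> float:
    """Replay queries against a key-value cache and return hits / total."""
    hits = 0
    for query in queries:
        key = signature(query)
        if cache.get(key) is not None:   # non-null result counts as a hit
            hits += 1
        else:
            cache[key] = value_for_miss(query)  # assumed fill-on-miss policy
    return hits / len(queries) if queries else 0.0


def find_key_elements(user_queries, candidates, threshold):
    """Return the first hit rate and the elements whose second hit rate
    differs from it by less than the difference threshold."""
    first_rate = hit_rate(user_queries, {}, lambda q: f"results for {q}")
    key_elements = []
    for element in candidates:
        # Assumed construction: a first query keeps only the element under test.
        first_queries = [{element: q[element]} for q in user_queries]
        second_rate = hit_rate(first_queries, {}, lambda q: PLACEHOLDER)
        if abs(second_rate - first_rate) < threshold:
            key_elements.append(element)
    return first_rate, key_elements


if __name__ == "__main__":
    # Hypothetical traffic sample with two elements per user query.
    sample = [
        {"phrase": "a", "terminal": "mobile"},
        {"phrase": "b", "terminal": "mobile"},
        {"phrase": "a", "terminal": "mobile"},
        {"phrase": "c", "terminal": "pc"},
        {"phrase": "b", "terminal": "mobile"},
        {"phrase": "a", "terminal": "mobile"},
    ]
    print(find_key_elements(sample, ["phrase", "terminal"], 0.1))
```

In this hypothetical sample, replaying only the key phrase reproduces the first hit rate of 0.5, so the key phrase is reported as a key element, whereas replaying only the terminal type yields a noticeably higher hit rate and is not reported. Storing a placeholder rather than a full query result keeps the offline cache small, since only the presence of a key matters for counting hits.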